{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# **Machine Learning Workflow Orchestration**\n",
    "\n",
    "Orchestration refers to the coordination and management of various tasks, resources, and processes involved in the end-to-end machine learning lifecycle. This includes:\n",
    "\n",
    "1. Data Preparation and Management\n",
    "2. Model Training\n",
    "3. Experimentation and Evaluaiton\n",
    "4. Model Deployment\n",
    "5. Monitor and Management\n",
    "6. Automation of repetitive tasks\n",
    "\n",
    "### **Introducing Prefect**  \n",
    "Prefect is an open-source orchestration and observability platform that empowers developers to build and scale resilient code quickly, turning their Python scripts into resilient, recurring workflows.\n",
    "\n",
    "Prefect streamlines the orchestration of machine learning workflows by providing a flexible, scalable, and reliable framework for building, deploying, and managing complex data pipelines with ease. It empowers data scientists and engineers to focus on building machine learning models and solving business problems while abstracting away the complexities of workflow management and execution.\n",
    "\n",
    "Prefect versions:\n",
    "- Prefect 1.x AKA Prefect Core\n",
    "- Prefect 2.x AKA Prefect Orion\n",
    "\n",
    "### **Why Prefect?**\n",
    "- Python based open source tool  \n",
    "- Manage ML Pipelines  \n",
    "- Schedule and Monitor the flow  \n",
    "- Gives observability into failures  \n",
    "- Native dask integration for scaling (Dask is used for parallel computing)\n",
    "\n",
    "\n",
    "### **Creating and activating a Virtual Environment**\n",
    "In order to install prefect, create a virtual environment:\n",
    "> `$ python -m venv .mlops_env`  \n",
    "\n",
    "Enter the Virtual Environment using below mentioned command:\n",
    "> `$ .mlops_env\\Scripts\\activate`\n",
    "\n",
    "***\n",
    "\n",
    "### **Installing Prefect 2.x**\n",
    "Now install Prefect:\n",
    "> `$ pip install prefect`  \n",
    "\n",
    "OR  if you have Prefect 1, upgrade to Prefect 2 using this command:  \n",
    "> `$ pip install -U prefect`  \n",
    "\n",
    "OR to install a specific version:  \n",
    "> `$ pip install prefect==2.4`  \n",
    "\n",
    "***\n",
    "\n",
    "### **Check Prefect Version**\n",
    "Check the prefect version:\n",
    "> `$ prefect version`\n",
    "\n",
    "***\n",
    "\n",
    "### **Running Prefect Dashboard (UI)**\n",
    "\n",
    "> `$ prefect server start`  \n",
    "\n",
    "```\n",
    " ___ ___ ___ ___ ___ ___ _____\n",
    "| _ \\ _ \\ __| __| __/ __|_   _|\n",
    "|  _/   / _|| _|| _| (__  | |\n",
    "|_| |_|_\\___|_| |___\\___| |_|\n",
    "\n",
    "Configure Prefect to communicate with the server with:\n",
    "\n",
    "    prefect config set PREFECT_API_URL=http://127.0.0.1:4200/api\n",
    "\n",
    "View the API reference documentation at http://127.0.0.1:4200/docs\n",
    "\n",
    "Check out the dashboard at http://127.0.0.1:4200\n",
    "```\n",
    "***\n",
    "\n",
    "**Note - In one of the earliest update of Prefect Orion, in Windows OS, if your path contains spaces, it will generate error (as mentioned below) when you try to run prefect orion. Sharing this so that you know what it is if you see it.**\n",
    "\n",
    "```\n",
    "___ ___ ___ ___ ___ ___ _____    ___  ___ ___ ___  _  _\n",
    "| _ \\ _ \\ __| __| __/ __|_   _|  / _ \\| _ \\_ _/ _ \\| \\| |\n",
    "|  _/   / _|| _|| _| (__  | |   | (_) |   /| | (_) | .` |\n",
    "|_| |_|_\\___|_| |___\\___| |_|    \\___/|_|_\\___\\___/|_|\\_|\n",
    "Configure Prefect to communicate with the server with:\n",
    "    prefect config set PREFECT_API_URL=http://127.0.0.1:4200/api\n",
    "View the API reference documentation at http://127.0.0.1:4200/docs\n",
    "Check out the dashboard at http://127.0.0.1:4200/\n",
    "Usage: uvicorn [OPTIONS] APP\n",
    "\n",
    "Try 'uvicorn --help' for help.\n",
    "\n",
    "Error: Got unexpected extra argument (prefect.orion.api.server:create_app)\n",
    "Orion stopped!\n",
    "```\n",
    "\n",
    "<img src=\"images/prefect_dashboard.JPG\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## **Refactoring the ML Workflow**\n",
    "\n",
    "If you've written the entire workflow code cell by cell in a Jupyter Notebook without explicitly defining functions, you may encounter difficulties when trying to visualize and monitor your flows in the Prefect dashboard.\n",
    "\n",
    "Prefect works best when workflows are organized into modular functions, with each function representing a task in your workflow. This allows Prefect to track task dependencies, visualize the workflow graph, and provide detailed execution logs and status updates in the dashboard.\n",
    "\n",
    "However, if you've written your workflow code directly in a Jupyter Notebook without defining functions, you can still use Prefect to run your workflows, but you may miss out on some of the dashboard's features and benefits.\n",
    "\n",
    "To address this, you can refactor your workflow code to extract each step into separate functions, and then import these functions into your notebook. This way, you can maintain the convenience of writing and experimenting with code in a notebook while also leveraging Prefect's capabilities for workflow orchestration and monitoring.\n",
    "\n",
    "Once you've refactored your code to use functions, you can run your flows as usual using Prefect's CLI or Python API, and you'll be able to visualize and monitor them in the Prefect dashboard."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "from sklearn.preprocessing import MinMaxScaler\n",
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "\n",
    "from sklearn import metrics"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "def load_data(file_path):\n",
    "    \"\"\"\n",
    "    Load data from a CSV file.\n",
    "    \"\"\"\n",
    "    return pd.read_csv(file_path)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "def split_inputs_output(data, inputs, output):\n",
    "    \"\"\"\n",
    "    Split features and target variables.\n",
    "    \"\"\"\n",
    "    X = data[inputs]\n",
    "    y = data[output]\n",
    "    return X, y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "def split_train_test(X, y, test_size=0.25, random_state=0):\n",
    "    \"\"\"\n",
    "    Split data into train and test sets.\n",
    "    \"\"\"\n",
    "    return train_test_split(X, y, test_size=test_size, random_state=random_state)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "def preprocess_data(X_train, X_test, y_train, y_test):\n",
    "    \"\"\"\n",
    "    Rescale the data.\n",
    "    \"\"\"\n",
    "    scaler = MinMaxScaler()\n",
    "    X_train_scaled = scaler.fit_transform(X_train)\n",
    "    X_test_scaled = scaler.transform(X_test)\n",
    "    return X_train_scaled, X_test_scaled, y_train, y_test"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "def train_model(X_train_scaled, y_train, hyperparameters):\n",
    "    \"\"\"\n",
    "    Training the machine learning model.\n",
    "    \"\"\"\n",
    "    clf = KNeighborsClassifier(**hyperparameters)\n",
    "    clf.fit(X_train_scaled, y_train)\n",
    "    return clf"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "def evaluate_model(model, X_train_scaled, y_train, X_test_scaled, y_test):\n",
    "    \"\"\"\n",
    "    Evaluating the model.\n",
    "    \"\"\"\n",
    "    y_train_pred = model.predict(X_train_scaled)\n",
    "    y_test_pred = model.predict(X_test_scaled)\n",
    "\n",
    "    train_score = metrics.accuracy_score(y_train, y_train_pred)\n",
    "    test_score = metrics.accuracy_score(y_test, y_test_pred)\n",
    "    \n",
    "    return train_score, test_score"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "def workflow(data_path):\n",
    "    DATA_PATH = data_path\n",
    "    INPUTS = ['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']\n",
    "    OUTPUT = 'Species'\n",
    "    HYPERPARAMETERS = {'n_neighbors': 3, 'p': 2}\n",
    "    \n",
    "    # Load data\n",
    "    iris = load_data(DATA_PATH)\n",
    "\n",
    "    # Identify Inputs and Output\n",
    "    X, y = split_inputs_output(iris, INPUTS, OUTPUT)\n",
    "\n",
    "    # Split data into train and test sets\n",
    "    X_train, X_test, y_train, y_test = split_train_test(X, y)\n",
    "\n",
    "    # Preprocess the data\n",
    "    X_train_scaled, X_test_scaled, y_train, y_test = preprocess_data(X_train, X_test, y_train, y_test)\n",
    "\n",
    "    # Build a model\n",
    "    model = train_model(X_train_scaled, y_train, HYPERPARAMETERS)\n",
    "    \n",
    "    # Evaluation\n",
    "    train_score, test_score = evaluate_model(model, X_train_scaled, y_train, X_test_scaled, y_test)\n",
    "    \n",
    "    print(\"Train Score:\", train_score)\n",
    "    print(\"Test Score:\", test_score)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Train Score: 0.9732142857142857\n",
      "Test Score: 0.9736842105263158\n"
     ]
    }
   ],
   "source": [
    "if __name__ == \"__main__\":\n",
    "    workflow(data_path=\"data/iris.csv\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## **Building a Prefect Workflow**\n",
    "\n",
    "Step 1 - Import Prefect modules\n",
    "\n",
    "Step 2 - Define Prefect Tasks\n",
    "\n",
    "Step 3 - Define Prefect Flow\n",
    "\n",
    "Step 5 - Run Prefect Flow\n",
    "\n",
    "<img src=\"images/prefect_flow_run.JPG\">"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "from prefect import task, flow"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "@task\n",
    "def load_data(file_path):\n",
    "    \"\"\"\n",
    "    Load data from a CSV file.\n",
    "    \"\"\"\n",
    "    return pd.read_csv(file_path)\n",
    "\n",
    "\n",
    "@task\n",
    "def split_inputs_output(data, inputs, output):\n",
    "    \"\"\"\n",
    "    Split features and target variables.\n",
    "    \"\"\"\n",
    "    X = data[inputs]\n",
    "    y = data[output]\n",
    "    return X, y\n",
    "\t\n",
    "\n",
    "@task\n",
    "def split_train_test(X, y, test_size=0.25, random_state=0):\n",
    "    \"\"\"\n",
    "    Split data into train and test sets.\n",
    "    \"\"\"\n",
    "    return train_test_split(X, y, test_size=test_size, random_state=random_state)\n",
    "\t\n",
    "\t\n",
    "@task\n",
    "def preprocess_data(X_train, X_test, y_train, y_test):\n",
    "    \"\"\"\n",
    "    Rescale the data.\n",
    "    \"\"\"\n",
    "    scaler = MinMaxScaler()\n",
    "    X_train_scaled = scaler.fit_transform(X_train)\n",
    "    X_test_scaled = scaler.transform(X_test)\n",
    "    return X_train_scaled, X_test_scaled, y_train, y_test\n",
    "\t\n",
    "\n",
    "@task\n",
    "def train_model(X_train_scaled, y_train, hyperparameters):\n",
    "    \"\"\"\n",
    "    Training the machine learning model.\n",
    "    \"\"\"\n",
    "    clf = KNeighborsClassifier(**hyperparameters)\n",
    "    clf.fit(X_train_scaled, y_train)\n",
    "    return clf\n",
    "\t\n",
    "\n",
    "@task\n",
    "def evaluate_model(model, X_train_scaled, y_train, X_test_scaled, y_test):\n",
    "    \"\"\"\n",
    "    Evaluating the model.\n",
    "    \"\"\"\n",
    "    y_train_pred = model.predict(X_train_scaled)\n",
    "    y_test_pred = model.predict(X_test_scaled)\n",
    "\n",
    "    train_score = metrics.accuracy_score(y_train, y_train_pred)\n",
    "    test_score = metrics.accuracy_score(y_test, y_test_pred)\n",
    "    \n",
    "    return train_score, test_score"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Workflow\n",
    "\n",
    "@flow(name=\"KNN Training Flow\")\n",
    "def workflow():\n",
    "    DATA_PATH = \"data/iris.csv\"\n",
    "    INPUTS = ['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']\n",
    "    OUTPUT = 'Species'\n",
    "    HYPERPARAMETERS = {'n_neighbors': 3, 'p': 2}\n",
    "    \n",
    "    # Load data\n",
    "    iris = load_data(DATA_PATH)\n",
    "\n",
    "    # Identify Inputs and Output\n",
    "    X, y = split_inputs_output(iris, INPUTS, OUTPUT)\n",
    "\n",
    "    # Split data into train and test sets\n",
    "    X_train, X_test, y_train, y_test = split_train_test(X, y)\n",
    "\n",
    "    # Preprocess the data\n",
    "    X_train_scaled, X_test_scaled, y_train, y_test = preprocess_data(X_train, X_test, y_train, y_test)\n",
    "\n",
    "    # Build a model\n",
    "    model = train_model(X_train_scaled, y_train, HYPERPARAMETERS)\n",
    "    \n",
    "    # Evaluation\n",
    "    train_score, test_score = evaluate_model(model, X_train_scaled, y_train, X_test_scaled, y_test)\n",
    "    \n",
    "    print(\"Train Score:\", train_score)\n",
    "    print(\"Test Score:\", test_score)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:42.939 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | prefect.engine - Created flow run<span style=\"color: #800080; text-decoration-color: #800080\"> 'jumping-mamba'</span> for flow<span style=\"color: #800080; text-decoration-color: #800080; font-weight: bold\"> 'KNN Training Flow'</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:42.939 | \u001b[36mINFO\u001b[0m    | prefect.engine - Created flow run\u001b[35m 'jumping-mamba'\u001b[0m for flow\u001b[1;35m 'KNN Training Flow'\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:43.068 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Flow run<span style=\"color: #800080; text-decoration-color: #800080\"> 'jumping-mamba'</span> - Created task run 'load_data-0' for task 'load_data'\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:43.068 | \u001b[36mINFO\u001b[0m    | Flow run\u001b[35m 'jumping-mamba'\u001b[0m - Created task run 'load_data-0' for task 'load_data'\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:43.071 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Flow run<span style=\"color: #800080; text-decoration-color: #800080\"> 'jumping-mamba'</span> - Executing 'load_data-0' immediately...\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:43.071 | \u001b[36mINFO\u001b[0m    | Flow run\u001b[35m 'jumping-mamba'\u001b[0m - Executing 'load_data-0' immediately...\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:43.225 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Task run 'load_data-0' - Finished in state <span style=\"color: #008000; text-decoration-color: #008000\">Completed</span>()\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:43.225 | \u001b[36mINFO\u001b[0m    | Task run 'load_data-0' - Finished in state \u001b[32mCompleted\u001b[0m()\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:43.273 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Flow run<span style=\"color: #800080; text-decoration-color: #800080\"> 'jumping-mamba'</span> - Created task run 'split_inputs_output-0' for task 'split_inputs_output'\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:43.273 | \u001b[36mINFO\u001b[0m    | Flow run\u001b[35m 'jumping-mamba'\u001b[0m - Created task run 'split_inputs_output-0' for task 'split_inputs_output'\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:43.277 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Flow run<span style=\"color: #800080; text-decoration-color: #800080\"> 'jumping-mamba'</span> - Executing 'split_inputs_output-0' immediately...\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:43.277 | \u001b[36mINFO\u001b[0m    | Flow run\u001b[35m 'jumping-mamba'\u001b[0m - Executing 'split_inputs_output-0' immediately...\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:43.413 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Task run 'split_inputs_output-0' - Finished in state <span style=\"color: #008000; text-decoration-color: #008000\">Completed</span>()\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:43.413 | \u001b[36mINFO\u001b[0m    | Task run 'split_inputs_output-0' - Finished in state \u001b[32mCompleted\u001b[0m()\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:43.463 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Flow run<span style=\"color: #800080; text-decoration-color: #800080\"> 'jumping-mamba'</span> - Created task run 'split_train_test-0' for task 'split_train_test'\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:43.463 | \u001b[36mINFO\u001b[0m    | Flow run\u001b[35m 'jumping-mamba'\u001b[0m - Created task run 'split_train_test-0' for task 'split_train_test'\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:43.466 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Flow run<span style=\"color: #800080; text-decoration-color: #800080\"> 'jumping-mamba'</span> - Executing 'split_train_test-0' immediately...\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:43.466 | \u001b[36mINFO\u001b[0m    | Flow run\u001b[35m 'jumping-mamba'\u001b[0m - Executing 'split_train_test-0' immediately...\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:43.601 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Task run 'split_train_test-0' - Finished in state <span style=\"color: #008000; text-decoration-color: #008000\">Completed</span>()\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:43.601 | \u001b[36mINFO\u001b[0m    | Task run 'split_train_test-0' - Finished in state \u001b[32mCompleted\u001b[0m()\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:43.648 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Flow run<span style=\"color: #800080; text-decoration-color: #800080\"> 'jumping-mamba'</span> - Created task run 'preprocess_data-0' for task 'preprocess_data'\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:43.648 | \u001b[36mINFO\u001b[0m    | Flow run\u001b[35m 'jumping-mamba'\u001b[0m - Created task run 'preprocess_data-0' for task 'preprocess_data'\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:43.651 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Flow run<span style=\"color: #800080; text-decoration-color: #800080\"> 'jumping-mamba'</span> - Executing 'preprocess_data-0' immediately...\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:43.651 | \u001b[36mINFO\u001b[0m    | Flow run\u001b[35m 'jumping-mamba'\u001b[0m - Executing 'preprocess_data-0' immediately...\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:43.783 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Task run 'preprocess_data-0' - Finished in state <span style=\"color: #008000; text-decoration-color: #008000\">Completed</span>()\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:43.783 | \u001b[36mINFO\u001b[0m    | Task run 'preprocess_data-0' - Finished in state \u001b[32mCompleted\u001b[0m()\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:43.831 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Flow run<span style=\"color: #800080; text-decoration-color: #800080\"> 'jumping-mamba'</span> - Created task run 'train_model-0' for task 'train_model'\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:43.831 | \u001b[36mINFO\u001b[0m    | Flow run\u001b[35m 'jumping-mamba'\u001b[0m - Created task run 'train_model-0' for task 'train_model'\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:43.834 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Flow run<span style=\"color: #800080; text-decoration-color: #800080\"> 'jumping-mamba'</span> - Executing 'train_model-0' immediately...\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:43.834 | \u001b[36mINFO\u001b[0m    | Flow run\u001b[35m 'jumping-mamba'\u001b[0m - Executing 'train_model-0' immediately...\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:43.977 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Task run 'train_model-0' - Finished in state <span style=\"color: #008000; text-decoration-color: #008000\">Completed</span>()\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:43.977 | \u001b[36mINFO\u001b[0m    | Task run 'train_model-0' - Finished in state \u001b[32mCompleted\u001b[0m()\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:44.031 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Flow run<span style=\"color: #800080; text-decoration-color: #800080\"> 'jumping-mamba'</span> - Created task run 'evaluate_model-0' for task 'evaluate_model'\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:44.031 | \u001b[36mINFO\u001b[0m    | Flow run\u001b[35m 'jumping-mamba'\u001b[0m - Created task run 'evaluate_model-0' for task 'evaluate_model'\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:44.034 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Flow run<span style=\"color: #800080; text-decoration-color: #800080\"> 'jumping-mamba'</span> - Executing 'evaluate_model-0' immediately...\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:44.034 | \u001b[36mINFO\u001b[0m    | Flow run\u001b[35m 'jumping-mamba'\u001b[0m - Executing 'evaluate_model-0' immediately...\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:44.188 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Task run 'evaluate_model-0' - Finished in state <span style=\"color: #008000; text-decoration-color: #008000\">Completed</span>()\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:44.188 | \u001b[36mINFO\u001b[0m    | Task run 'evaluate_model-0' - Finished in state \u001b[32mCompleted\u001b[0m()\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Train Score: 0.9732142857142857\n",
      "Test Score: 0.9736842105263158\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">15:11:44.262 | <span style=\"color: #008080; text-decoration-color: #008080\">INFO</span>    | Flow run<span style=\"color: #800080; text-decoration-color: #800080\"> 'jumping-mamba'</span> - Finished in state <span style=\"color: #008000; text-decoration-color: #008000\">Completed</span>('All states completed.')\n",
       "</pre>\n"
      ],
      "text/plain": [
       "15:11:44.262 | \u001b[36mINFO\u001b[0m    | Flow run\u001b[35m 'jumping-mamba'\u001b[0m - Finished in state \u001b[32mCompleted\u001b[0m('All states completed.')\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "if __name__ == \"__main__\":\n",
    "    workflow()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## **Make Your Workflow Schedulable**\n",
    "\n",
    "There are two way I will discuss using which you can auto schedule your workflow.\n",
    "\n",
    "**Note:** For both the implementations, make sure you have moved your prefect `tasks` and `flows` to `.py` files.\n",
    "\n",
    "1. Using Serve Function\n",
    "\n",
    "```python\n",
    "if __name__ == \"__main__\":\n",
    "    workflow.serve(\n",
    "        name=\"my-first-deployment\",\n",
    "        cron=\"* * * * *\"\n",
    "    )\n",
    "```\n",
    "2. Using Deploy Function\n",
    "```python\n",
    "# For this implementation, you should ensure that you create the 'work_pool_name'\n",
    "# This can be done in the Prefect Dashboard by navigating to 'Work Pool'\n",
    "if __name__ == \"__main__\":\n",
    "    workflow.deploy(\n",
    "        name=\"my-first-deployment\",\n",
    "        cron=\"* * * * *\",\n",
    "        work_pool_name=\"local-work-pool\"\n",
    "    )\n",
    "```\n",
    "\n",
    "### **What is `cron`?**\n",
    "'cron' refers to a scheduling expression that defines the frequency at which a flow should be executed. It follows a specific syntax that allows you to specify recurring time intervals for running your flow.\n",
    "\n",
    "A cron expression consists of five fields separated by spaces:\n",
    "\n",
    "1. Minute (0 - 59)\n",
    "2. Hour (0 - 23)\n",
    "3. Day of the month (1 - 31)\n",
    "4. Month (1 - 12)\n",
    "5. Day of the week (0 - 7) (Sunday is 0 or 7)\n",
    "\n",
    "For example, a cron expression of \"0 1 * * *\" means the flow will run every day at 1:00 AM.\n",
    "\n",
    "**[Click here](https://crontab.guru/) to experiment.**  \n",
    "\n",
    "Here are a few more examples of cron expressions:\n",
    "\n",
    "- \"0 * * * *\" (run every hour at the beginning of the hour)\n",
    "- \"0 0 * * 0\" (run every Sunday at midnight)\n",
    "- \"0 8-18 * * *\" (run every hour between 8:00 AM and 6:00 PM)\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.13"
  },
  "toc": {
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
